Orchid species richness

This dataset is looking at the species richness of orchid flora and their response to 30 different environmental factors in 3500 grid cells in the Beipan River Basin in Guizhou Province. For this exercise I selected species richness as the response variable and mean annual temperature, mean annual precipitation, elevation, human population density, and area of cropland as predictor variables.

Figure 1. Dendrobium officinale
Figure 1. Dendrobium officinale
Figure 2. Chinese Ground Orchid (Bletilla striata)
Figure 2. Chinese Ground Orchid (Bletilla striata)

Colinearity

We have a high correlation between Mean Annual Temperature and Mean Annual Precipitation, as expected. There is relatively low correlation between the other variables.

model_1 <- lm(richness ~ MAT + MAP + elevation + population + cropland, data=orch)
anova (model_1) #coefficients of the full model
## Analysis of Variance Table
## 
## Response: richness
##              Df Sum Sq Mean Sq F value    Pr(>F)    
## MAT           1  37910   37910 172.528 < 2.2e-16 ***
## MAP           1  15221   15221  69.272 < 2.2e-16 ***
## elevation     1  83989   83989 382.232 < 2.2e-16 ***
## population    1   7816    7816  35.570 2.706e-09 ***
## cropland      1  47670   47670 216.945 < 2.2e-16 ***
## Residuals  3493 767528     220                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Multicolinearity Check

## # Check for Multicollinearity
## 
## Low Correlation
## 
##        Term  VIF   VIF 95% CI Increased SE Tolerance Tolerance 95% CI
##         MAT 1.75 [1.68, 1.84]         1.32      0.57     [0.54, 0.60]
##         MAP 1.76 [1.68, 1.85]         1.33      0.57     [0.54, 0.59]
##   elevation 1.19 [1.15, 1.24]         1.09      0.84     [0.81, 0.87]
##  population 1.07 [1.04, 1.12]         1.04      0.93     [0.89, 0.96]
##    cropland 1.10 [1.07, 1.15]         1.05      0.91     [0.87, 0.94]

Since all variables have a Variance Inflation Factor less than 5, we can conclude that there is a low correlation between the predictor variables.

Model Performance

Possible Models

## Global model call: lm(formula = richness ~ MAT + MAP + elevation + population + 
##     cropland, data = orch)
## ---
## Model selection table 
##     (Intrc)      crpln   elvtn     MAP      MAT      ppltn df    logLik    AICc  delta
## 32 -55.2900 -2.239e-06 0.02160 0.07295 -0.48660 -0.0002350  7 -14395.89 28805.8   0.00
## 24 -51.3200 -2.183e-06 0.02068 0.06371          -0.0002335  6 -14399.45 28810.9   5.10
## 16 -58.3700 -2.320e-06 0.02247 0.07476 -0.47980             6 -14403.56 28819.1  13.33
## 8  -54.4400 -2.264e-06 0.02156 0.06563                      5 -14407.00 28824.0  18.20
## 28  13.3600 -2.282e-06 0.01587          1.01500 -0.0003007  6 -14480.54 28973.1 167.29
## 12  11.5800 -2.387e-06 0.01682          1.07200             5 -14492.57 28995.2 189.34
## 23 -68.2500            0.02389 0.07252          -0.0003535  5 -14501.49 29013.0 207.18
## 31 -69.2700            0.02412 0.07469 -0.11170 -0.0003546  6 -14501.31 29014.6 208.83
## 20  29.5600 -2.494e-06 0.01653                  -0.0003345  5 -14504.63 29019.3 213.46
## 7  -74.0300            0.02543 0.07599                      4 -14518.10 29044.2 238.39
## 15 -74.7700            0.02560 0.07757 -0.08047             5 -14518.00 29046.0 240.21
## 4   28.5800 -2.625e-06 0.01762                              4 -14519.39 29046.8 240.97
## 30 -20.5700 -2.550e-06         0.04409  0.26400 -0.0003952  6 -14520.75 29053.5 247.72
## 22 -22.0200 -2.590e-06         0.04874          -0.0004001  5 -14521.80 29053.6 247.80
## 14 -23.4700 -2.711e-06         0.04519  0.32840             5 -14541.44 29092.9 287.09
## 6  -25.3200 -2.764e-06         0.05101                      4 -14543.04 29094.1 288.28
## 26  19.5600 -2.523e-06                  1.14400 -0.0004109  5 -14553.40 29116.8 311.00
## 10  17.5900 -2.690e-06                  1.23400             4 -14575.36 29158.7 352.91
## 18  38.2000 -2.775e-06                          -0.0004543  4 -14582.82 29173.6 367.83
## 27   0.7777            0.01830          1.43400 -0.0004242  5 -14584.92 29179.9 374.04
## 11  -2.6150            0.01982          1.54300             4 -14607.83 29223.7 417.86
## 2   37.6300 -2.983e-06                                      3 -14609.40 29224.8 418.98
## 19  22.8600            0.01959                  -0.0004906  4 -14631.68 29271.4 465.57
## 29 -32.0600                    0.04221  0.79920 -0.0005552  5 -14649.28 29308.6 502.76
## 21 -37.1700                    0.05667          -0.0005783  4 -14658.44 29324.9 519.08
## 3   20.8600            0.02148                              3 -14661.83 29329.7 523.84
## 25   6.4890                             1.63600 -0.0005686  4 -14677.13 29362.3 556.45
## 13 -37.2900                    0.04363  0.94080             4 -14688.01 29384.0 578.21
## 5  -43.6000                    0.06085                      3 -14700.50 29407.0 601.20
## 9    2.4620                             1.81000             3 -14717.12 29440.3 634.44
## 17  32.4000                                     -0.0006570  3 -14735.28 29476.6 670.75
## 1   30.9100                                                 2 -14787.60 29579.2 773.39
##    weight
## 32  0.927
## 24  0.072
## 16  0.001
## 8   0.000
## 28  0.000
## 12  0.000
## 23  0.000
## 31  0.000
## 20  0.000
## 7   0.000
## 15  0.000
## 4   0.000
## 30  0.000
## 22  0.000
## 14  0.000
## 6   0.000
## 26  0.000
## 10  0.000
## 18  0.000
## 27  0.000
## 11  0.000
## 2   0.000
## 19  0.000
## 29  0.000
## 21  0.000
## 3   0.000
## 25  0.000
## 13  0.000
## 5   0.000
## 9   0.000
## 17  0.000
## 1   0.000
## Models ranked by AICc(x)

There are 32 possible models based on additive combinations of variables (no interaction terms).

Best Models

## Global model call: lm(formula = richness ~ MAT + MAP + elevation + population + 
##     cropland, data = orch)
## ---
## Model selection table 
##    (Intrc)      crpln   elvtn     MAP     MAT      ppltn df    logLik    AICc delta
## 32  -55.29 -2.239e-06 0.02160 0.07295 -0.4866 -0.0002350  7 -14395.89 28805.8   0.0
## 24  -51.32 -2.183e-06 0.02068 0.06371         -0.0002335  6 -14399.45 28810.9   5.1
##    weight
## 32  0.928
## 24  0.072
## Models ranked by AICc(x)
## Global model call: lm(formula = richness ~ MAT + MAP + elevation + population + 
##     cropland, data = orch)
## ---
## Model selection table 
##    (Intrc)      crpln  elvtn     MAP     MAT     ppltn df    logLik    AICc delta weight
## 32  -55.29 -2.239e-06 0.0216 0.07295 -0.4866 -0.000235  7 -14395.89 28805.8     0      1
## Models ranked by AICc(x)

Variable Importance Weights

##                      cropland elevation MAP  population MAT 
## Sum of weights:      1.00     1.00      1.00 1.00       0.93
## N containing models:   16       16        16   16         16

We can see that area of cropland, elevation, mean annual precipitation, and human population density are all equally important. Mean annual temperature is less important, but it should still be considered.

Model Averaging

## 
## Call:
## model.avg(object = dredge_1, revised.var = TRUE)
## 
## Component models: 
## '12345'  '1235'   '1234'   '123'    '1245'   '124'    '235'    '2345'   '125'    '23'     
## '234'    '12'     '1345'   '135'    '134'    '13'     '145'    '14'     '15'     '245'    
## '24'     '1'      '25'     '345'    '35'     '2'      '45'     '34'     '3'      '4'      
## '5'      '(Null)'
## 
## Coefficients: 
##        (Intercept)      cropland  elevation       MAP        MAT    population
## full     -55.00649 -2.235242e-06 0.02153479 0.0722874 -0.4514571 -0.0002345950
## subset   -55.00649 -2.235242e-06 0.02153479 0.0722874 -0.4866271 -0.0002348973
#summary(model.avg(dredge_wash)) # if you want to average across all models, both competitive and non-competitive
summary(model.avg(dredge_1, subset = delta < 6)) # if you just want to look only at competitive models, which
## 
## Call:
## model.avg(object = dredge_1, subset = delta < 6)
## 
## Component model call: 
## lm(formula = richness ~ <2 unique rhs>, data = orch)
## 
## Component models: 
##       df    logLik     AICc delta weight
## 12345  7 -14395.89 28805.82   0.0   0.93
## 1235   6 -14399.45 28810.92   5.1   0.07
## 
## Term codes: 
##   cropland  elevation        MAP        MAT population 
##          1          2          3          4          5 
## 
## Model-averaged coefficients:  
## (full average) 
##               Estimate Std. Error Adjusted SE z value Pr(>|z|)    
## (Intercept) -5.500e+01  5.822e+00   5.824e+00   9.445  < 2e-16 ***
## cropland    -2.235e-06  1.526e-07   1.527e-07  14.639  < 2e-16 ***
## elevation    2.153e-02  1.361e-03   1.362e-03  15.812  < 2e-16 ***
## MAP          7.229e-02  5.966e-03   5.968e-03  12.112  < 2e-16 ***
## MAT         -4.515e-01  2.163e-01   2.163e-01   2.087   0.0369 *  
## population  -2.349e-04  6.001e-05   6.003e-05   3.913 9.11e-05 ***
##  
## (conditional average) 
##               Estimate Std. Error Adjusted SE z value Pr(>|z|)    
## (Intercept) -5.500e+01  5.822e+00   5.824e+00   9.445  < 2e-16 ***
## cropland    -2.235e-06  1.526e-07   1.527e-07  14.639  < 2e-16 ***
## elevation    2.153e-02  1.361e-03   1.362e-03  15.812  < 2e-16 ***
## MAP          7.229e-02  5.966e-03   5.968e-03  12.112  < 2e-16 ***
## MAT         -4.866e-01  1.825e-01   1.826e-01   2.665   0.0077 ** 
## population  -2.349e-04  6.001e-05   6.003e-05   3.913 9.11e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#is the point of model selection.
#there is justification for looking only at the competitive models; trying to narrow things down.

Plots

Unsure why there are negative values for species richness. When including human population density as a predictive effect, we would want to use a different error structure because the prediction line becomes negative for population densities above 50,000 people per square kilometer